Overview

Dataset statistics

Number of variables: 14
Number of observations: 2036380
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 217.5 MiB
Average record size in memory: 112.0 B
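
The memory figures above are related by simple arithmetic. The sketch below recomputes the average record size from the total size and the row count. It assumes that every column is stored as an 8-byte value (for example int64 or float64); that assumption is not stated in the report, but it matches both the 112.0 B record size and the 15.5 MiB reported per variable.

```python
# Figures copied from the dataset statistics.
n_rows = 2_036_380
n_columns = 14
total_mib = 217.5                      # "Total size in memory"

total_bytes = total_mib * 1024 ** 2    # 1 MiB = 1024 * 1024 bytes
avg_record_bytes = total_bytes / n_rows

# Assumption (not stated in the report): each value occupies 8 bytes.
bytes_per_value = 8
expected_record_bytes = n_columns * bytes_per_value
per_column_mib = n_rows * bytes_per_value / 1024 ** 2

print(f"{avg_record_bytes:.1f} B per record")
print(f"{expected_record_bytes} B per record under the 8-byte assumption")
print(f"{per_column_mib:.1f} MiB per column")
```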

Variable types

NUM: 8
BOOL: 3
CAT: 3

Warnings

PJ_IDADE has 34122 (1.7%) zeros
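
The percentage in this warning is the number of zeros divided by the number of observations. A minimal sketch of that calculation follows; the helper `zero_share` and the toy column are hypothetical illustrations and are not part of the report or of pandas-profiling.

```python
def zero_share(values):
    """Return the fraction of entries that are exactly zero."""
    values = list(values)
    return sum(1 for v in values if v == 0) / len(values)


# Toy column (hypothetical data, not taken from the dataset).
toy_ages = [0, 3, 7, 0, 12, 5, 1, 9]
toy_share = zero_share(toy_ages)       # 2 of the 8 entries are zero

# The warning's own figures: 34122 zeros among 2036380 observations.
reported_share = 34_122 / 2_036_380
print(f"{reported_share:.1%}")
```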

Reproduction

Analysis started: 2020-09-27 19:01:02.076267
Analysis finished: 2020-09-27 19:08:51.087202
Duration: 7 minutes and 49.01 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
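
The reported duration is the difference between the two timestamps. The following sketch reproduces it with the standard library; the timestamp format string is an assumption based on how the values are printed above.

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the report.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-09-27 19:01:02.076267", fmt)
finished = datetime.strptime("2020-09-27 19:08:51.087202", fmt)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```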

Variables

CPF
Real number (ℝ≥0)

Distinct: 1133983
Distinct (%): 55.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.342408284e+10
Minimum: 1163
Maximum: 9.999999417e+10
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1163
5-th percentile: 896463468.8
Q1: 4696805638
Median: 1.49572958e+10
Q3: 6.381681412e+10
95-th percentile: 9.23940709e+10
Maximum: 9.999999417e+10
Range: 9.999999301e+10
Interquartile range (IQR): 5.912000848e+10

Descriptive statistics

Standard deviation: 3.27704197e+10
Coefficient of variation (CV): 0.9804433485
Kurtosis: -1.164674217
Mean: 3.342408284e+10
Median Absolute Deviation (MAD): 1.388917766e+10
Skewness: 0.6178446598
Sum: 6.806413381e+16
Variance: 1.073900407e+21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
596634307239< 0.1%
 
8.593706135e+10167< 0.1%
 
3077961326106< 0.1%
 
5.849690417e+1095< 0.1%
 
1.816014681e+1052< 0.1%
 
1.372061577e+1047< 0.1%
 
7.504062073e+1042< 0.1%
 
640047238041< 0.1%
 
9.959913929e+1039< 0.1%
 
98943731034< 0.1%
 
7.654310227e+1034< 0.1%
 
9.315800975e+1034< 0.1%
 
8.626575076e+1033< 0.1%
 
1.064471367e+1032< 0.1%
 
9.141596979e+1032< 0.1%
 
536327947932< 0.1%
 
110363779732< 0.1%
 
551802871731< 0.1%
 
5.359502569e+1031< 0.1%
 
315249374630< 0.1%
 
811043444430< 0.1%
 
1.083728474e+1029< 0.1%
 
122100573129< 0.1%
 
7.732149972e+1029< 0.1%
 
7.483854723e+1029< 0.1%
 
Other values (1133958)203505199.9%
 
ValueCountFrequency (%) 
11631< 0.1%
 
19101< 0.1%
 
51501< 0.1%
 
412037< 0.1%
 
441301< 0.1%
 
805271< 0.1%
 
836231< 0.1%
 
856772< 0.1%
 
886921< 0.1%
 
1002262< 0.1%
 
ValueCountFrequency (%) 
9.999999417e+102< 0.1%
 
9.999989713e+101< 0.1%
 
9.999972277e+103< 0.1%
 
9.999932312e+101< 0.1%
 
9.99992851e+102< 0.1%
 
9.999897127e+102< 0.1%
 
9.999891217e+101< 0.1%
 
9.999880712e+103< 0.1%
 
9.999863737e+101< 0.1%
 
9.999862927e+101< 0.1%
 

CNPJ
Real number (ℝ≥0)

Distinct: 1067801
Distinct (%): 52.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.154005683e+13
Minimum: 455000107
Maximum: 9.7711797e+13
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 455000107
5-th percentile: 4.2227743e+12
Q1: 1.363548425e+13
Median: 2.1794344e+13
Q3: 2.90989375e+13
95-th percentile: 3.507963e+13
Maximum: 9.7711797e+13
Range: 9.7711342e+13
Interquartile range (IQR): 1.546345325e+13

Descriptive statistics

Standard deviation: 1.114489914e+13
Coefficient of variation (CV): 0.5174034232
Kurtosis: 7.19524337
Mean: 2.154005683e+13
Median Absolute Deviation (MAD): 7.73911e+12
Skewness: 1.341547192
Sum: 4.386374092e+19
Variance: 1.242087768e+26
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1.379503e+12103< 0.1%
 
1.0609938e+1395< 0.1%
 
5.021025e+1291< 0.1%
 
7.774211e+1291< 0.1%
 
7.715251e+1291< 0.1%
 
5.586590002e+1180< 0.1%
 
2.276269e+1378< 0.1%
 
1.0014536e+1375< 0.1%
 
9.116860002e+1174< 0.1%
 
5.605572e+1273< 0.1%
 
1.6808908e+1373< 0.1%
 
1.3265725e+1372< 0.1%
 
1.5334477e+1371< 0.1%
 
2.8016077e+1368< 0.1%
 
1.2076338e+1365< 0.1%
 
3.265392e+1265< 0.1%
 
1.8233963e+1363< 0.1%
 
1.0627791e+1363< 0.1%
 
1.5427788e+1363< 0.1%
 
3.4882134e+1363< 0.1%
 
6.987324e+1262< 0.1%
 
5.549487e+1262< 0.1%
 
1.342356e+1361< 0.1%
 
1.4092821e+1360< 0.1%
 
9.24845e+1259< 0.1%
 
Other values (1067776)203455999.9%
 
ValueCountFrequency (%) 
4550001071< 0.1%
 
31290001531< 0.1%
 
32510001201< 0.1%
 
35740001135< 0.1%
 
50580001282< 0.1%
 
64860001751< 0.1%
 
68170001771< 0.1%
 
81510001961< 0.1%
 
92900001341< 0.1%
 
1.04780001e+101< 0.1%
 
ValueCountFrequency (%) 
9.7711797e+131< 0.1%
 
9.7554556e+131< 0.1%
 
9.7554536e+132< 0.1%
 
9.7554451e+131< 0.1%
 
9.7554433e+131< 0.1%
 
9.7554425e+131< 0.1%
 
9.7554233e+132< 0.1%
 
9.7554202e+135< 0.1%
 
9.7554128e+132< 0.1%
 
9.7554083e+131< 0.1%
 

PF_GENERO
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count    Frequency (%)
1      1021882  50.2%
0      1014498  49.8%

PF_IDADE
Real number (ℝ≥0)

Distinct: 109
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.08843291
Minimum: 1
Maximum: 121
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 25
Q1: 33
Median: 41
Q3: 50
95-th percentile: 62
Maximum: 121
Range: 120
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.54921712
Coefficient of variation (CV): 0.2744035907
Kurtosis: -0.2870915227
Mean: 42.08843291
Median Absolute Deviation (MAD): 8
Skewness: 0.3744856412
Sum: 85708043
Variance: 133.384416
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
38                 70964   3.5%
39                 69966   3.4%
37                 66711   3.3%
40                 65962   3.2%
35                 65794   3.2%
41                 64973   3.2%
34                 63981   3.1%
36                 63667   3.1%
33                 62684   3.1%
42                 61904   3.0%
32                 60616   3.0%
43                 60268   3.0%
44                 57937   2.8%
31                 55827   2.7%
45                 54627   2.7%
46                 53618   2.6%
47                 50661   2.5%
48                 50434   2.5%
30                 49817   2.4%
49                 48897   2.4%
50                 47899   2.4%
29                 47626   2.3%
51                 44738   2.2%
52                 43889   2.2%
28                 43234   2.1%
Other values (84)  609686  29.9%

Value  Count  Frequency (%)
1      30     < 0.1%
2      57     < 0.1%
3      69     < 0.1%
4      35     < 0.1%
5      26     < 0.1%
6      45     < 0.1%
7      25     < 0.1%
8      36     < 0.1%
9      53     < 0.1%
10     41     < 0.1%

Value  Count  Frequency (%)
121    7      < 0.1%
120    4      < 0.1%
116    2      < 0.1%
110    5      < 0.1%
109    9      < 0.1%
104    3      < 0.1%
103    1      < 0.1%
102    2      < 0.1%
101    8      < 0.1%
100    9      < 0.1%

PJ_PORTE
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count    Frequency (%)
1      1229773  60.4%
2      614166   30.2%
3      192441   9.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 3
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count    Frequency (%)
1      1229773  60.4%
2      614166   30.2%
3      192441   9.5%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  2036380  100.0%

Most frequent Decimal Number characters

Value  Count    Frequency (%)
1      1229773  60.4%
2      614166   30.2%
3      192441   9.5%

Most occurring scripts

Value   Count    Frequency (%)
Common  2036380  100.0%

Most frequent Common characters

Value  Count    Frequency (%)
1      1229773  60.4%
2      614166   30.2%
3      192441   9.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2036380  100.0%

Most frequent ASCII characters

Value  Count    Frequency (%)
1      1229773  60.4%
2      614166   30.2%
3      192441   9.5%

PJ_SETOR
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count   Frequency (%)
1      886117  43.5%
2      815471  40.0%
3      328020  16.1%
4      6772    0.3%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count   Frequency (%)
1      886117  43.5%
2      815471  40.0%
3      328020  16.1%
4      6772    0.3%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  2036380  100.0%

Most frequent Decimal Number characters

Value  Count   Frequency (%)
1      886117  43.5%
2      815471  40.0%
3      328020  16.1%
4      6772    0.3%

Most occurring scripts

Value   Count    Frequency (%)
Common  2036380  100.0%

Most frequent Common characters

Value  Count   Frequency (%)
1      886117  43.5%
2      815471  40.0%
3      328020  16.1%
4      6772    0.3%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2036380  100.0%

Most frequent ASCII characters

Value  Count   Frequency (%)
1      886117  43.5%
2      815471  40.0%
3      328020  16.1%
4      6772    0.3%

PJ_IDADE
Real number (ℝ≥0)

ZEROS

Distinct: 71
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.963640381
Minimum: 0
Maximum: 89
Zeros: 34122
Zeros (%): 1.7%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 6
Q3: 10
95-th percentile: 25
Maximum: 89
Range: 89
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 7.646088427
Coefficient of variation (CV): 0.9601247746
Kurtosis: 5.459204121
Mean: 7.963640381
Median Absolute Deviation (MAD): 3
Skewness: 2.108002061
Sum: 16216998
Variance: 58.46266823
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
2                  239505  11.8%
3                  186602  9.2%
1                  172691  8.5%
5                  170717  8.4%
4                  168511  8.3%
6                  143478  7.0%
7                  138810  6.8%
8                  125367  6.2%
10                 122088  6.0%
9                  120389  5.9%
11                 42958   2.1%
12                 34152   1.7%
0                  34122   1.7%
13                 29368   1.4%
14                 23960   1.2%
15                 23286   1.1%
16                 21453   1.1%
18                 19053   0.9%
17                 19021   0.9%
19                 18474   0.9%
20                 17181   0.8%
21                 17134   0.8%
23                 15010   0.7%
22                 14591   0.7%
24                 13222   0.6%
Other values (46)  105237  5.2%

Value  Count   Frequency (%)
0      34122   1.7%
1      172691  8.5%
2      239505  11.8%
3      186602  9.2%
4      168511  8.3%
5      170717  8.4%
6      143478  7.0%
7      138810  6.8%
8      125367  6.2%
9      120389  5.9%

Value  Count  Frequency (%)
89     3      < 0.1%
79     1      < 0.1%
72     2      < 0.1%
71     2      < 0.1%
70     2      < 0.1%
69     2      < 0.1%
68     3      < 0.1%
64     2      < 0.1%
62     3      < 0.1%
61     13     < 0.1%

PJ_NUM_FUNCIONARIOS
Real number (ℝ≥0)

Distinct: 101
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.689320264
Minimum: 0
Maximum: 100
Zeros: 543
Zeros (%): < 0.1%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 10
Maximum: 100
Range: 100
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 5.630983411
Coefficient of variation (CV): 2.093831473
Kurtosis: 86.76980429
Mean: 2.689320264
Median Absolute Deviation (MAD): 0
Skewness: 7.688172723
Sum: 5476478
Variance: 31.70797417
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count    Frequency (%)
1                  1466755  72.0%
2                  179637   8.8%
3                  78142    3.8%
5                  60312    3.0%
4                  53773    2.6%
10                 30068    1.5%
6                  29008    1.4%
8                  19910    1.0%
7                  17728    0.9%
20                 12725    0.6%
9                  11196    0.5%
12                 10233    0.5%
15                 9095     0.4%
11                 5836     0.3%
13                 4842     0.2%
14                 4447     0.2%
30                 3754     0.2%
16                 3501     0.2%
19                 3376     0.2%
18                 3257     0.2%
17                 2418     0.1%
25                 2340     0.1%
22                 1993     0.1%
40                 1491     0.1%
21                 1472     0.1%
Other values (76)  19071    0.9%

Value  Count    Frequency (%)
0      543      < 0.1%
1      1466755  72.0%
2      179637   8.8%
3      78142    3.8%
4      53773    2.6%
5      60312    3.0%
6      29008    1.4%
7      17728    0.9%
8      19910    1.0%
9      11196    0.5%

Value  Count  Frequency (%)
100    639    < 0.1%
99     79     < 0.1%
98     33     < 0.1%
97     33     < 0.1%
96     25     < 0.1%
95     26     < 0.1%
94     5      < 0.1%
93     26     < 0.1%
92     51     < 0.1%
91     10     < 0.1%

CANAL_ATENDIMENTO
Real number (ℝ≥0)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.672515935
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.181288277
Coefficient of variation (CV): 0.7062941837
Kurtosis: 3.254986087
Mean: 1.672515935
Median Absolute Deviation (MAD): 0
Skewness: 1.975144269
Sum: 3405878
Variance: 1.395441994
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count    Frequency (%)
1      1340678  65.8%
2      364113   17.9%
3      134362   6.6%
4      87844    4.3%
5      73786    3.6%
6      35597    1.7%

Value  Count    Frequency (%)
1      1340678  65.8%
2      364113   17.9%
3      134362   6.6%
4      87844    4.3%
5      73786    3.6%
6      35597    1.7%

Value  Count    Frequency (%)
6      35597    1.7%
5      73786    3.6%
4      87844    4.3%
3      134362   6.6%
2      364113   17.9%
1      1340678  65.8%

TEMA_ATENDIMENTO
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.788784019
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 4
Q3: 5
95-th percentile: 8
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.205156225
Coefficient of variation (CV): 0.5820221513
Kurtosis: -0.4210133738
Mean: 3.788784019
Median Absolute Deviation (MAD): 1
Skewness: 0.3937365601
Sum: 7715404
Variance: 4.862713978
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count   Frequency (%)
4      604326  29.7%
1      520002  25.5%
5      396770  19.5%
2      161892  7.9%
7      131579  6.5%
9      75860   3.7%
8      61334   3.0%
3      43901   2.2%
6      40716   2.0%

Value  Count   Frequency (%)
1      520002  25.5%
2      161892  7.9%
3      43901   2.2%
4      604326  29.7%
5      396770  19.5%
6      40716   2.0%
7      131579  6.5%
8      61334   3.0%
9      75860   3.7%

Value  Count   Frequency (%)
9      75860   3.7%
8      61334   3.0%
7      131579  6.5%
6      40716   2.0%
5      396770  19.5%
4      604326  29.7%
3      43901   2.2%
2      161892  7.9%
1      520002  25.5%

ABORDAGEM_ATENDIMENTO
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count    Frequency (%)
0      1959999  96.2%
1      76381    3.8%

CATEGORIA_ATENDIMENTO
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count    Frequency (%)
0      1108433  54.4%
2      927947   45.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value  Count    Frequency (%)
0      1108433  54.4%
2      927947   45.6%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  2036380  100.0%

Most frequent Decimal Number characters

Value  Count    Frequency (%)
0      1108433  54.4%
2      927947   45.6%

Most occurring scripts

Value   Count    Frequency (%)
Common  2036380  100.0%

Most frequent Common characters

Value  Count    Frequency (%)
0      1108433  54.4%
2      927947   45.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2036380  100.0%

Most frequent ASCII characters

Value  Count    Frequency (%)
0      1108433  54.4%
2      927947   45.6%

INSTRUMENTO_ATENDIMENTO
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.595919229
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8875848309
Coefficient of variation (CV): 0.5561589927
Kurtosis: 1.221163122
Mean: 1.595919229
Median Absolute Deviation (MAD): 0
Skewness: 1.370494125
Sum: 3249898
Variance: 0.787806832
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Value  Count    Frequency (%)
1      1283927  63.0%
2      367302   18.0%
3      329215   16.2%
4      35958    1.8%
5      19978    1.0%

Value  Count    Frequency (%)
1      1283927  63.0%
2      367302   18.0%
3      329215   16.2%
4      35958    1.8%
5      19978    1.0%

Value  Count    Frequency (%)
5      19978    1.0%
4      35958    1.8%
3      329215   16.2%
2      367302   18.0%
1      1283927  63.0%

MEIO_ATENDIMENTO
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count    Frequency (%)
0      1810516  88.9%
1      225864   11.1%

Interactions

[Figures omitted: 64 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that, for a linear relationship, the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
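
The definition above can be sketched in a few lines of plain Python. This is an illustrative implementation on toy data (the values of `x` and `y` are hypothetical), not the code the report itself runs; it also checks the invariance under changes of location and scale described above.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data (hypothetical): y is close to a linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling each variable (with positive factors) leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# Negating one variable flips the sign of r.
r_flipped = pearson_r(x, [-b for b in y])

print(r, r_shifted, r_flipped)
```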

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
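
A minimal sketch of this calculation: rank both variables, then apply the Pearson formula to the ranks. Tied values receive the average of their ranks, which is a common convention and an assumption here; the helper names and the toy data are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy data ρ reaches its maximum while Pearson's r stays below 1, which illustrates why ρ is better suited to nonlinear monotonic relationships.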

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
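
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table of counts, following the correction commonly attributed to Bergsma (2013). It is written from the published formula rather than taken from the report's source code, and it assumes that no row or column of the table sums to zero. The function name and the example tables are hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (assumes every expected count is positive).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))


# Toy tables (hypothetical counts).
independent = [[10, 20], [20, 40]]     # rows are proportional: no association
perfect = [[50, 0], [0, 50]]           # each row maps to exactly one column

v_independent = cramers_v_corrected(independent)
v_perfect = cramers_v_corrected(perfect)
print(v_independent, v_perfect)
```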

Missing values

[Figures omitted: two missing-value plots]

Sample

First rows

CPF CNPJ PF_GENERO PF_IDADE PJ_PORTE PJ_SETOR PJ_IDADE PJ_NUM_FUNCIONARIOS CANAL_ATENDIMENTO TEMA_ATENDIMENTO ABORDAGEM_ATENDIMENTO CATEGORIA_ATENDIMENTO INSTRUMENTO_ATENDIMENTO MEIO_ATENDIMENTO
01.376741e+091.181879e+1313612101250211
19.055791e+106.295390e+1104521252270211
25.763942e+093.089587e+131371123150210
32.800365e+092.679087e+131411331110010
41.235194e+102.761945e+130231131110210
56.796476e+103.094479e+130481221110010
66.930782e+093.650945e+131311301670020
79.257545e+083.426127e+1213931211350020
89.141819e+101.853110e+131331371150211
97.223612e+101.155983e+1313912101140010

Last rows

CPF CNPJ PF_GENERO PF_IDADE PJ_PORTE PJ_SETOR PJ_IDADE PJ_NUM_FUNCIONARIOS CANAL_ATENDIMENTO TEMA_ATENDIMENTO ABORDAGEM_ATENDIMENTO CATEGORIA_ATENDIMENTO INSTRUMENTO_ATENDIMENTO MEIO_ATENDIMENTO
20363705.494597e+092.835837e+12038232219160010
20363712.549322e+095.047121e+1202621361551250
20363727.949333e+104.142210e+1314221281540210
20363735.258084e+091.224097e+13129221020250230
20363749.603447e+102.175042e+131361351140010
20363755.382794e+091.441622e+131272193150020
20363768.491391e+101.195691e+1314312101150010
20363771.124962e+109.502510e+11127312511350020
20363781.181559e+101.090873e+1312231113140230
20363797.281580e+102.849556e+131351131150210